Frequent Hypergraph Mining
Authors
Abstract
We introduce the class of frequent hypergraph mining problems, which includes the frequent graph mining problem class and also contains the frequent itemset mining problem. We study the computational properties of different problems belonging to this class. In particular, besides negative results, we present practically relevant problems that can be solved in incremental-polynomial time. Some of our practical algorithms are obtained by reductions to frequent graph mining and itemset mining problems. Our experimental results in the domain of citation analysis show the potential of the framework on problems that have no natural representation as an ordinary graph.
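As an illustration of the frequent itemset mining problem that the abstract names as a member of this class, the following is a minimal levelwise (Apriori-style) sketch. It is not the algorithm of the paper: the function name `frequent_itemsets`, the example transactions, and the support threshold are assumptions chosen for illustration only.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset contained in at least
    `min_support` transactions (hypothetical helper, levelwise search)."""
    transactions = [frozenset(t) for t in transactions]

    def support(itemset):
        # Number of transactions that contain every item of `itemset`.
        return sum(1 for t in transactions if itemset <= t)

    # Level 1: frequent single items.
    items = sorted({i for t in transactions for i in t})
    level = [frozenset([i]) for i in items
             if support(frozenset([i])) >= min_support]

    result = {}
    k = 1
    while level:
        for s in level:
            result[s] = support(s)
        frequent_k = set(level)
        # Candidates of size k+1: unions of two frequent k-itemsets whose
        # k-subsets are all frequent (anti-monotonicity of support).
        candidates = set()
        for a, b in combinations(level, 2):
            c = a | b
            if len(c) == k + 1 and all(
                frozenset(sub) in frequent_k for sub in combinations(c, k)
            ):
                candidates.add(c)
        level = [c for c in candidates if support(c) >= min_support]
        k += 1
    return result


if __name__ == "__main__":
    # Assumed toy data: five transactions over the items a, b, c.
    data = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    for itemset, count in sorted(
        frequent_itemsets(data, 3).items(),
        key=lambda kv: (len(kv[0]), sorted(kv[0])),
    ):
        print(sorted(itemset), count)
```

The sketch relies only on the fact that support never increases when an itemset grows, so a candidate is counted only if all of its immediate subsets are already known to be frequent.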
Similar resources
A Data Mining Formalization to Improve Hypergraph Minimal Transversal Computation
Finding hypergraph transversals is a major algorithmic issue that has been shown to have many connections with the data mining area. In this paper, by defining a new Galois connection, we show that this problem is closely related to the mining of the so-called condensed representations of frequent patterns. This data mining formalization enables us to benefit from efficient algorithms dedicated to t...
Transaction Databases, Frequent Itemsets, and Their Condensed Representations
Mining frequent itemsets is a fundamental task in data mining. Unfortunately the number of frequent itemsets describing the data is often too large to comprehend. This problem has been attacked by condensed representations of frequent itemsets that are subcollections of frequent itemsets containing only the frequent itemsets that cannot be deduced from other frequent itemsets in the subcollecti...
Deciding Monotone Duality and Identifying Frequent Itemsets
The monotone duality problem is defined as follows: Given two monotone formulas f and g in irredundant DNF, decide whether f and g are dual. This problem is the same as duality testing for hypergraphs, that is, checking whether a hypergraph H consists of precisely all minimal transversals of a hypergraph G. By exploiting a recent problem-decomposition method by Boros and Makino (ICALP 2009), we ...
Pattern Mining for General Intelligence: The FISHGRAM Algorithm for Frequent and Interesting Subhypergraph Mining
Fishgram, a novel algorithm for recognizing frequent or otherwise interesting sub-hypergraphs in large, heterogeneous hypergraphs, is presented. The algorithm’s implementation in the OpenCog integrative AGI framework is described, and concrete examples are given showing the patterns it recognizes in OpenCog’s hypergraph knowledge store when the OpenCog system is used to control a virtual agent in ...
Identification in a Text Corpus
TopCat (Topic Categories) is a technique for identifying topics that recur in articles in a text corpus. Natural language processing techniques are used to identify key entities in individual articles, allowing us to represent an article as a set of items. This allows us to view the problem in a database/data mining context: Identifying related groups of items. This paper presents a novel metho...